INTRODUCTION


Investigations  of  adverse  environmental  effects  almost  exclusively 
employ  repeated  measures  of  subjects  (Kennedy  &  Bittner,  1977).  The  general 
approach  in  such  studies  is  to  collect  data  on  one  or  more  trials  conducted 
Before, During, and After exposure. Interest in time-course effects frequently
dictates a Before-During-After (BDA) experiment, but financial and
statistical arguments for economy can also be made for this paradigm (Campbell
& Stanley, 1963; Winer, 1971). In addition, shortages of qualified research
subjects  and  simulator  space  can  make  experiments  with  independent  groups 
infeasible.  However,  the  most  potent  argument  for  repeated  measures  often 
rests  on  the  requirement  to  minimize  subject  risk  in  hazardous  environments. 
This  argument  presumes  that  with  tasks  which  are  suitably  stable  for 
repeated  measurements  the  number  of  trials  and  thus,  subject  exposure,  will 
be  reduced  because  individual  differences  may  be  removed  from  the  error 
variance.  Clearly,  statistical  suitability  should  be  sought  before  a  task 
is used in a BDA experiment where risk minimization is a serious consideration.
A program to evaluate the suitability of performance tests for
repeated measures and to develop methodologies for test applications has
been underway for the last four years (Kennedy & Bittner, 1977; Carter,
Kennedy, & Bittner, 1980; Kennedy, Carter, & Bittner, 1980; Bittner,
1981a; Bittner & Carter, 1981; Guignard, Bittner, & Carter, 1981). This
investigation  is  directed  at  the  baseline  evaluation  of  tasks  drawn  from 
the  Moran  Battery  (Moran  &  Mefferd,  1959)  and  the  Carter  and  Sbisa  Computer 
Generated  Battery  (1981)  as  part  of  this  program. 

According  to  Jones  (1972,  1980),  candidate  tests  for  repeated  measures 
studies  should  meet  rigorous  statistical  qualifications.  Meaningful  repeated 
measurements generally require that the means, variances, and intertrial
correlations of a test be well-behaved when obtained under constant conditions.
Specific statistical characteristics considered necessary are as follows:
(1) the means change in a linear manner or are unchanging over trials; (2)
variances are homogeneous over trials; and (3) intertrial correlations are
differentially stable. Pertinently, the criterion for the means follows the
considerations of Campbell and Stanley (1963, p. 38), and those for variances
and correlations are implied by traditional assumptions for related measures
analysis of variance (e.g., Scheffé, 1959; Winer, 1971). Constancy of
correlation, in addition to being a traditional assumption, assures that the
same  attribute  is  being  measured  on  each  occasion  of  measurement  (Alvares 
& Hulin, 1972). Without such constancy, attribution of effect and scientific
generalization may be impossible (Bittner, 1979; Jones, Kennedy, &
Bittner, 1981). The present investigation was directed at determining when
in practice, if ever, tasks attain the desired statistical characteristics
under baseline conditions.


This work was funded by the Naval Medical Research and Development Command
and was performed under Navy Work Unit No. MF58.524.002-5027. The
research program was identified as the Performance Evaluation Tests for
Environmental Research (PETER) Program in earlier reports. The opinions
are those of the authors, and do not necessarily reflect those of the
Department of the Navy. Requests for reprints may be sent to Dr. Alvah
C. Bittner, Jr., Naval Biodynamics Laboratory, Box 29407, New Orleans, LA
70189.
Several investigations are relevant to the present study. Horn (1972)
has reported factor analyses which included three cognitive tasks from the
Moran battery in a monumental investigation of 16 tasks administered daily
for 10 days to 106 subjects. These analyses were designed to separate
"trait" from "state" contributions to task variation. Horn identified Perceptual
Speed, Visualization, and Flexibility of Closure as having state
variation.  However,  state  contributions  were  typically  smaller  than  trait 
contributions.  Unfortunately,  the  Horn  (1972)  results  were  not  reported 
in  sufficient  detail  to  permit  mean,  variance,  and  differential  stability 
analyses  of  the  types  identified  above.  More  recent  investigations  in  this 
laboratory  have  reported  results  of  stability  analyses  for  tasks  similar  to 
those  included  in  the  Computer  Battery  (Seales,  Kennedy,  &  Bittner,  1980; 
Carter, Kennedy, & Bittner, 1981). Seales, et al. (1980) reported that a
10 minute arithmetic test (composed of successive addition, subtraction,
multiplication, and division subtasks) possessed mean, variance, and differential
stability from the first day of a 13 day study. The average
correlation across days was r = 0.94 for the total correct score. More
recently, Carter, et al. (1981) have reported that the Grammatical Reasoning
Task (Baddeley, 1968) met all stability criteria after only four daily (60
second) administrations, with a reliability of r = 0.82 across the differentially
stabilized trials. The Horn (1972), Seales, et al. (1980), and
Carter,  et  al.  (1981)  results  encouraged  the  present  multi-task  investigation. 

One purpose of this investigation was to evaluate the statistical
characteristics of tasks drawn from the Moran (Moran & Mefferd, 1959) and
Computer Batteries (Carter & Sbisa, 1981). A second purpose was to explore
the relationships between tasks subsequent to their becoming differentially
stable.


METHOD 


The  approach  in  this  investigation  was  to  conduct  two  sequential 
experiments with each directed at a specific battery. In the first experiment,
tasks from the Moran Battery were studied and, in the second, tasks
from the Computer Battery were investigated. The two experiments are described
sequentially in the following sections.

Experiment  1:  Moran  Battery 


Tasks 


The Moran Battery employed in this study consisted of five simple
paper-and-pencil tests which were constructed to follow the format in French's
(1954) kit of reference aptitude and achievement factors (Moran & Mefferd,
1959; Moran, Kimble, & Mefferd, 1964). Twenty alternate forms for each task
with accompanying instruction sheets and practice problems are available.
The tasks measured included: Flexibility of Closure (FC), Number Facility
(NF), Perceptual Speed (PS), Speed of Closure (SC), and Visualization (V).
Copies of the alternate forms were obtained from Moran.

Flexibility of Closure (FC). This task required retaining
the image of a specified configuration despite the influence of
other distracting configurations in the perceptual field (Moran
& Mefferd, 1959). The specific configuration was given in this
task, unlike the situation for Speed of Closure, in the form of
36 geometric figures to be copied onto matrices of dots. Ekstrom,
French, Harmon, and Dermen (1976) placed this task under a general
Closure, Flexibility of (CF) factor together with Hidden
Figures and Hidden Patterns tasks. In addition, they gave a
brief review and 26 references in a format followed for all their
referenced factors. The FC score was the number of figures
correctly copied in 180 seconds.

Number Facility (NF). This test required the addition of
one or two digit numbers in sets of three (Moran & Mefferd,
1959). Ekstrom, et al. (1976) placed a similar task under a
general Number Facility (N) factor together with Division,
Subtraction, and Multiplication, and Addition and Subtraction
Correction Tests. The NF test score was the number of correct
answers in 180 seconds.

Perceptual Speed (PS). This task required the crossing
out of every digit that was like one circled at the beginning
of that row in a row of 30 digits (Moran & Mefferd, 1959). It
appears to fall under the general Ekstrom, et al. (1976) Perceptual
Speed (P) factor which was identified by Finding A's, Number
Comparison, and Identical Picture Tests. Ekstrom, et al. also
provided 70 references to studies identifying Perceptual Speed.
The score of the Moran and Mefferd task was the number of digits
correctly marked in 150 seconds.

Speed of Closure (SC). This task required the search for
simple four-letter words embedded in fields of random letters
which did not form unintended words (Moran & Mefferd, 1959).
Words were mainly nouns, with proper names, foreign, and plural
words excluded. As opposed to the Flexibility of Closure task
described earlier, foreknowledge of the material to be searched
was not given. Ekstrom, et al. (1976) placed this task in a
general Closure, Verbal (CV) factor which was identified by
Scrambled Words, Hidden Words, and Incomplete Words. The SC
score was the number of words correctly circled in 150 seconds.

Visualization (V). This task required visually following
the path of a line, from left to right, and placing the
line numbers in the appropriate cell on the right (Moran &
Mefferd, 1959). Sets of 10 "tangled lines" constituted the
stimulus material. Comparison of this task with the factor reference
cognitive tests identified by Ekstrom, et al. (1976)
suggested that this test was more related to their Spatial
Scanning (SS) Factor. Maze Tracing Speed, Choosing a Path, and
Map Planning tests identified the Ekstrom, et al. SS Factor,
which was defined as "Speed in exploring visually a wide or
complicated spatial field". Scoring on the Visualization (V)
Test was the number of cells correctly numbered in 180 seconds.

Subjects 

The subjects employed in this experiment were 18 volunteers from
a population of enlisted men, ages 19 to 24, assigned to this laboratory
as full-time research subjects. All volunteers were recruited, evaluated,
and employed in accordance with procedures specified in Secretary of the
Navy Instruction 3900.39 Series and Bureau of Medicine and Surgery Instruction
3900.6 Series. These Instructions are based on voluntary informed
consent and meet provisions of prevailing national and international guidelines.
Volunteers were given cardiovascular, pulmonary, skeletal, and
other examinations to ensure their capability to serve in possibly hazardous
environmental research; however, they were generally representative of the
enlisted population in intelligence. A description of the volunteer qualification
procedures appears in Thomas, Majewski, Ewing, and Gilbert (1978).
The subjects had some exposure to psychological testing, mostly
psychomotor, but had no previous exposure to the Moran test battery.

Procedure 


Prior to practice and testing, subjects were briefed on the tasks in
the experiment. A familiarization practice trial was given on the following
day on all tasks. Responses were checked to ensure task understanding.
Formal testing was then conducted for 13 work days (Monday through Friday)
with one trial per day on each task given in the order PS, SC, NF, FC,
and V. Trials were conducted on separate days to avoid inflation of
correlations by within-day autocorrelative effects (Thorndike, 1949).

Experiment 2: Computer Generated Battery


Tasks 


The computer battery employed in this study consisted of four
paper-and-pencil tasks with their items randomly sampled by computer from among
all items of their type (Carter & Sbisa, 1981). Because of the method
of generation, a very large number of alternate forms (>10^ ) may be produced
along with instruction sheets, practice problems, and answer sheets.
Tasks included in this study were Vertical Addition (Nv), Horizontal Addition
(Nh), Number Comparison (Nc), and Grammatical Reasoning (GR).

Vertical Addition (Nv). This task required the addition
of three two-digit numbers arrayed vertically. Conceptually,
Nv was based on the (vertical) Addition Test described by
Ekstrom, et al. (1976); however, they used three one- or two-
digit numbers. The conceptual basis of this task implied that
it would fall under their Number Facility (N) Factor. The Nv
score was the number of correct responses during two consecutive
120 second administrations.

Horizontal Addition (Nh). This task required the addition
of three three-digit numbers, ranging from 100 through
999, arranged horizontally. It was suggested by a task employed
by Alluisi (1969, pp. 68-69) and was employed to see if the
format altered the differential properties of the test. It was
suspected that a substantial portion of this task would fall
under Number Facility (N) as described by Ekstrom, et al. (1976).
The Nh score was the number of correct responses during two consecutive
120 second administrations.

Number Comparison (Nc). This task required the comparison
of two horizontally arranged numbers of 3 to 9 digits each and a
response of S (Same) or D (Different). Modeled after the Number
Comparison Test given in Ekstrom, et al. (1976), it would be
expected to fall under their Perceptual Speed (P) Factor. The
Nc score was the number of correct responses minus the number of
errors for a 180 second administration.

Grammatical Reasoning (GR). This task was based on
Baddeley's three minute reasoning test (1968) and required the
comparison of a statement on the order of two letters (A and B)
with a displayed order. For example, "A follows B: BA" or "A
is not preceded by B: BA". A response of T (True) or F (False)
was required. Thirty-two items constitute this test, using
affirmative or negative phrasing, active or passive voice, A or
B mentioned first, the verbs "precedes" or "follows", and validity
(T or F) of the comparisons. Different random item orders constituted
different forms of this task. Grammatical Reasoning
appeared to be related to the Ekstrom, et al. (1976) Reasoning,
Logical (RL) Factor. The RL factor is defined as "the ability
to reason from premise to conclusion, or to evaluate the correctness
of a conclusion". The GR score was the number of correct
minus wrong responses made in 90 seconds.
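
As an illustration of the item structure just described, the 32 items can be enumerated as the full crossing of five two-level factors. The following is a minimal Python sketch, not the Carter and Sbisa generator; the function names and the wording chosen for negative active statements ("does not precede") are assumptions made for illustration only.

```python
from itertools import product


def statement_is_true(first, verb, passive, negative, order):
    """Evaluate a statement about letters A and B against a displayed order."""
    second = "B" if first == "A" else "A"
    first_leads = order.index(first) < order.index(second)
    # Active "X precedes Y" and passive "X is followed by Y" both claim
    # that the first-mentioned letter comes first in the display.
    claims_first_leads = (verb == "precedes") != passive
    truth = claims_first_leads == first_leads
    return truth != negative  # negation flips the truth value


def render(first, verb, passive, negative):
    """Build the sentence text (wording is an illustrative assumption)."""
    second = "B" if first == "A" else "A"
    if passive:
        participle = "preceded" if verb == "precedes" else "followed"
        phrase = f"is {'not ' if negative else ''}{participle} by"
    else:
        phrase = f"does not {verb[:-1]}" if negative else verb
    return f"{first} {phrase} {second}"


def build_items():
    """Cross the five binary factors: 2**5 = 32 (sentence, order, valid) items."""
    items = []
    factors = product("AB", ("precedes", "follows"),
                      (False, True), (False, True), (True, False))
    for first, verb, passive, negative, valid in factors:
        # Pick the displayed order that gives the statement the wanted validity.
        matches_ab = statement_is_true(first, verb, passive, negative, "AB")
        order = "AB" if matches_ab == valid else "BA"
        items.append((render(first, verb, passive, negative), order, valid))
    return items
```

Shuffling the list returned by build_items() would give one random item order, that is, one form in the sense used above.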

Subjects  and  Procedure 

The subjects employed for this experiment were 17 volunteers from the
general population described in Experiment 1. Of these subjects, 12 had
previously been tested in Experiment 1. All subjects had previous psychological
testing exposure, primarily psychomotor. Subjects, subsequent to
briefing, were tested for 15 work days on the battery tasks administered
in random order.


RESULTS 

The  results  of  the  two  experiments  were  analyzed  in  three  phases.  In 
the first two phases, the Moran and Computer Batteries were individually
analyzed. The final phase of the analysis explored the differential relationships
between the two batteries.

Experiment 1: Moran Battery

The  analysis  of  tasks  was  conducted  in  two  stages  focused  sequentially 
on: (1) task differential stabilities and cross correlations; and (2) stability
of means (linearity) and variances (homogeneity) over days. These will
be  taken  up  in  turn,  with  tasks  considered  in  the  order:  FC,  NF,  PS,  SC,  and 
V. 

Differential  Stability  and  Cross  Correlations 

Determination of the point in practice at which each task became differentially
stable was accomplished using the methodology and general computer
program developed by Steiger (1980a, 1980b). As a first step for each task,
the constancy of the reliabilities over Days 1-13 was assessed. Failing a
clearly nonsignificant (p > .10) result, a second analysis was conducted over
Days 2-13 and significance was evaluated. Successive analyses were continued,
dropping leading days, until indications of differential stability
were obtained. Task analyses are given below and cross correlations of stabilized
tasks are given subsequently.
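
The successive-analysis procedure can be sketched as a simple loop. This is a minimal Python sketch that assumes a caller-supplied function, here the hypothetical pvalue_for_days, returning the p value of an equal-correlation test for a given set of days; it does not implement the Steiger test itself. The helper equal_correlation_df rests on the observation that the degrees of freedom reported in this section (for example, 77 for 13 days) equal the number of distinct day pairs minus one, which is treated here as an assumption about the test.

```python
def equal_correlation_df(n_days):
    """Assumed degrees of freedom: number of distinct day pairs minus one."""
    return n_days * (n_days - 1) // 2 - 1


def first_stable_day(days, pvalue_for_days, alpha=0.10):
    """Drop leading days until the constancy test is clearly nonsignificant.

    days            -- ordered list of day numbers, e.g. [1, 2, ..., 13]
    pvalue_for_days -- hypothetical callable: list of days -> p value
    alpha           -- a result counts as nonsignificant when p > alpha
    Returns the first day of the stable window, or None if none is found.
    """
    # At least three days are needed so that there is more than one
    # between-day correlation to compare.
    for start in range(len(days) - 2):
        window = days[start:]
        if pvalue_for_days(window) > alpha:
            return window[0]
    return None
```

With a 13-day study, first_stable_day(list(range(1, 14)), pvalue_for_days) returns 3 when the test first becomes nonsignificant for Days 3-13.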
Flexibility of Closure (FC). The differential stability test across
all days (1 - 13) was very highly significant (χ²(77) = 121.1; p < .0012),
indicating changing correlations. However, after dropping the first two
days, the test yielded nonsignificant results with χ²(54) = 65.8 (p > .13).
The estimate of the FC differentially stable reliability across Days 3-13
was r = 0.882.


Number Facility (NF). Across all days (1 - 13), the test statistic
indicated changing cross day reliabilities with χ²(77) = 201.4 (p < 10 ).
The Days 9-13 statistic, however, was nonsignificant with χ²(9) = 14.5
(p > .10). The estimate of the NF differentially stable reliability across
Days 9-13 was r = 0.830.

Perceptual Speed (PS). The stability test across Days 1-13 yielded
very highly significant results (χ²(77) = 135.6; p < .0001). However,
after dropping the first six days, the results were clearly nonsignificant
(χ²(20) = 23.3; p > .27). The PS estimated differentially stable reliability
across Days 7-13 was r = 0.837.

Speed of Closure (SC). Across all days (1 - 13), the test statistic
indicated changing cross day reliabilities (χ²(77) = 97.8; p = 0.06). The
test statistic after dropping the first day, however, was clearly nonsignificant
with χ²(65) = 97.8; p > .19. The Days 2 - 13 SC differentially
stable reliability was estimated as r = 0.767.

Visualization (V). The differential stability test across Days 1-13
yielded significant results with χ²(77) = 104.0 (p < .022). However, after
dropping the first 5 days, the test statistic was nonsignificant (χ²(27) =
36.7; p > .10). The Days 6-13 differentially stable reliability was estimated
as r = 0.664.

Table  1  gives  estimates  of  cross-task  correlations  over  respective 
differentially  stable  days  as  identified  above.  Paralleling  the  cross  day 
reliability  estimates,  task  correlations  were  averaged  over  all  pairs  of 
stable task days, excepting pairs measured on the same day. Within-day
correlations were not included so as to avoid inflation with within-day
state covariations shown for some of the Moran Battery tasks by Horn (1972,
p. 178). The estimated NF-V correlation of r = 0.118, for example, was the
Fisher-z average of the 35 cross correlations between their respective
stable Days 9-13 and 6-13. Pertinently, the averaged cross-day correlation
estimates of either differentially stable reliabilities or cross-task
correlations  have  been  shown  to  be  markedly  less  variable  than  single 
estimates  (Bittner,  1981b).  Table  1  also  summarizes  the  estimates  of  task 
reliabilities  and  provides  corrected-for-attenuation  estimates  of  cross 
correlations. 
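
The averaging and correction steps described above can be illustrated with a short Python sketch. It assumes the standard Fisher z transform and the standard correction-for-attenuation formula (the cross correlation divided by the square root of the product of the two reliabilities); the function names are illustrative and are not taken from the original analysis programs.

```python
import math


def cross_day_pairs(days_a, days_b):
    """All (day_a, day_b) pairs across two tasks, excluding same-day pairs."""
    return [(a, b) for a in days_a for b in days_b if a != b]


def fisher_z_average(correlations):
    """Average correlations on the Fisher z scale, then transform back to r."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Estimate the correlation between tasks free of measurement error."""
    return r_xy / math.sqrt(reliability_x * reliability_y)


# NF was stable over Days 9-13 and V over Days 6-13: 5 x 8 = 40 day pairs,
# less the 5 same-day pairs, leaves the 35 cross correlations cited above.
nf_v_pairs = cross_day_pairs(range(9, 14), range(6, 14))

# NF-V entry: r = 0.118 with reliabilities 0.830 and 0.664.
nf_v_corrected = correct_for_attenuation(0.118, 0.830, 0.664)
```

Rounded to three places, nf_v_corrected reproduces the 0.159 entered below the diagonal for the NF-V pair in Table 1.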

Analyses  of  Means  and  Variances 

Analysis of the points at which means and variances became stable was
accomplished respectively by: BMDP2V (Dixon & Brown, 1977); and Fmax (Winer,
1971) and regression statistical tests. Figure 1 shows means over days and
was a guide for the analyses. Examining this figure, it can be seen that,
generally, performance improved with practice on all tasks, indicating
learning.
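
The two variance checks used in these analyses can be sketched in Python as follows. This is a minimal illustration that assumes the scores are held as one list of subject scores per day; the function names are hypothetical, and significance still has to be judged against tabled critical values of Fmax (e.g., Winer, 1971) and of the correlation coefficient.

```python
import math
from statistics import mean, variance


def f_max(scores_by_day):
    """Fmax statistic: largest daily variance divided by the smallest."""
    variances = [variance(day) for day in scores_by_day]
    return max(variances) / min(variances)


def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_variance_trend(scores_by_day):
    """Correlation between day number and log-transformed daily variance."""
    day_numbers = list(range(1, len(scores_by_day) + 1))
    log_variances = [math.log(variance(day)) for day in scores_by_day]
    return pearson_r(day_numbers, log_variances)
```

The trend correlation complements the omnibus Fmax test because it is sensitive to a systematic drift in variance over days that the ratio of the two extreme variances alone may not detect.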
Table 1. Moran Battery Differentially Stabilized Cross-Day Correlations*

Test                            1         2         3         4         5

1) Flexibility of Closure    (0.882)    0.124     -.176     -.050     0.353
2) Number Facility            0.145    (0.830)    0.435     0.567     0.118
3) Perceptual Speed           -.201     0.511    (0.873)    0.603     0.028
4) Speed of Closure           -.061     0.711     0.737    (0.767)    -.050
5) Visualization              0.461     0.159     0.037     -.070    (0.664)

* Correlations above, reliabilities along, and corrected-for-attenuation
estimates below diagonal.

Flexibility of Closure (FC). Examining the FC plot, it can be seen
that performance appears to improve most rapidly over the first two days
and to be essentially linear thereafter. ANOVA over all days (1 - 13) was
significant (F(12,204) = 6.93) with significant nonlinear trends
(F(11,204) = 2.72, p < .003). After dropping the first two days, the significant
overall effect (F(10,170) = 11.04; p < 10 ) was dominated by the linear
component (F(1,17) = 14.40; p < .001) with all nonlinear components nonsignificant
(p > .05). The linear component accounted for 54.7% of the Days
3-13 trend. The variances were homogeneous over all days with Fmax(13,17)
= 2.13 (p > .10); the standard deviation across days was estimated to be 5.92.
Overall, means and variances were stable from Day 3 onward.
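
One way to read the statement that the linear component accounted for a given percentage of the trend is as the share of the between-day sum of squares captured by the linear contrast. The Python sketch below computes that share from the daily means; it assumes equally spaced days and equal numbers of subjects per day, and the interpretation and the function name are assumptions rather than a reconstruction of the BMDP2V output.

```python
def linear_trend_share(day_means):
    """Proportion of between-day variation in the means that is linear.

    Uses centered day numbers as linear contrast coefficients; with equal
    group sizes the common factor n cancels from the ratio.
    """
    k = len(day_means)
    coefficients = [j - (k - 1) / 2 for j in range(k)]
    grand_mean = sum(day_means) / k
    ss_between = sum((m - grand_mean) ** 2 for m in day_means)
    if ss_between == 0:
        raise ValueError("daily means are identical; no trend to partition")
    contrast = sum(c * m for c, m in zip(coefficients, day_means))
    ss_linear = contrast ** 2 / sum(c * c for c in coefficients)
    return ss_linear / ss_between
```

A perfectly linear set of means yields a share of 1.0, while a symmetric U-shaped set yields 0.0; the remainder of the between-day variation corresponds to the nonlinear components.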

Number Facility (NF). Figure 1 shows that NF performance generally
increases with practice and appears unchanging after Day 8. ANOVA over all
days (1 - 13) yielded F(12,204) = 6.36 (p < 10 ) with significant nonlinear
trends (F(11,204) = 3.05; p < .001). After dropping the first 8 days, the
overall F(4,68) = 1.14 was nonsignificant (p > .34). Although the variances
were not significantly heterogeneous across all days (Fmax(13,17) = 3.12;
p > .05), it was observed that the values for the first two days ranked
respectively lowest and next lowest with subsequent days appearing near
asymptotic. The correlation between logarithmic (log) transformed variances
and test-day number was r = 0.538 (p < .06). Dropping the first two days,
the correlation dropped to 0.132 (p > .69) and the standard deviation was
conservatively estimated as 10.35. The means and standard deviations were
jointly stable after Day 8.
Perceptual Speed (PS). Figure 1 shows that the PS mean, in addition to a
slight overall increase, appeared to fluctuate over the course of the first
eight days. ANOVA over all days (1 - 13) yielded F(12,204) = 9.60 (p < 10 )
with significant nonlinear trends (F(11,204) = 10.37; p < 10 ) accounting
for 82% of the variation. After dropping the first eight days, the means
became stable and level (F(4,68) = 0.57; p > .68) with a mean of 101.7. The
variances over all days, although appearing somewhat unstable, yielded a
nonsignificant Fmax(13,17) = 4.34 (p > .05) and a nonsignificant trend
test when log transformed variances were correlated with day number (r =
0.444; p = .128). Overall, means and variances were jointly stable only
over Days 9 - 13.

Speed of Closure (SC). Figure 1 gives the SC means, which generally
show increasing performance with "cyclic" nonlinear trends. ANOVA over all
days (1 - 13) yielded F(12,204) = 22.40 (p < 10 ) with the nonlinear trend
components significant (F(11,204) = 5.98; p < 10 ). Only after dropping the
first 10 days do the means appear stable and level (F(2,204) = 2.00; p > .15).
The Days 11 - 13 mean was 44.85. The variances across all days (1 - 13)
were homogeneous (Fmax(13,17) = 2.14; p > .10) with an estimated standard
deviation of 7.92. Jointly, the means and variances were apparently stable
only across Days 11 - 13.

Visualization (V). Figure 1 shows that V performance increased over
days with apparently higher order nonlinear trends. ANOVA over all (1 - 13)
days was significant (F(12,204) = 14.801; p < 10 ) with significant nonlinear
trends (F(11,204) = 6.30; p < 10 ). The nonlinear trends continued
in the means even over the last three days, where the nonlinear component
was still significant (F(1,204) = 9.71; p < .003). The standard deviations
across all days (1 - 13) appeared to be negatively related to day number with
the largest (10.00) on Day 1 and the smallest (6.12) on Day 13. This trend
was confirmed by the significant (p < .006) correlation, r = -0.710, between
day and log transformed variance. This variance trend appeared to continue,
although not significant (p > .05), over the last three days (r = -.532).
Hence, conservatively, neither V mean nor variance appeared stable even over
the last three days.


Experiment 2: Computer Battery

Analysis  of  tasks  was  conducted  in  three  stages  dealing  sequentially 
with  correlation,  variance,  and  mean  stability  as  in  the  first  experiment. 
Tasks  were  considered  in  the  order:  Nv,  Nh,  Nc  and  GR. 

Differential  Stability  and  Cross  Correlations 

Determination of the point in practice at which each task became
differentially stable was accomplished using the Steiger (1980a, 1980b)
based methodology employed for the first experiment. Task analyses are
described below and cross correlations of stabilized tasks are given
subsequently.

Vertical Addition (Nv). The Steiger differential stability test across
all (1 - 15) days indicated changing reliabilities (χ²(104) = 126.5; p < .07).
However, after dropping the first two days, the test statistic became
nonsignificant (χ²(77) = 90.8; p > .13). The Days 3 - 15 Nv differentially
stable reliability was estimated as r = 0.921.
Horizontal Addition (Nh). Indicating stability from the first day,
the Nh test statistic across all days was nonsignificant with χ²(104) =
105.4 (p > .44). The Days 1 - 15 Nh differentially stable reliability was
estimated as r = 0.785.

Number Comparison (NC). The NC Steiger analysis across all days (1 -
15) was very highly significant (χ²(104) = 142.5; p < .008). However, after
dropping the first two days, the test statistic became nonsignificant (χ²(77)
= 88.8; p > .16). The Days 3 - 15 NC differentially stable reliability estimate
was r = 0.766.

Grammatical Reasoning (GR). The GR test statistic over all days (1 -
15) was found significant (χ²(104) = 129.7; p < .05). After dropping the
first four days, the statistic became nonsignificant with χ²(54) = 64.4
(p > .15). The Days 6-15 differentially stable reliability estimate was
r = 0.874.

Table  2  gives  estimates  of  task  cross  correlations  over  respective 
differentially stable days as identified above. Table 2 also summarizes
the  estimates  of  task  reliabilities  and  provides  corrected-for-attenuation 
estimates  of  cross  correlations. 

Analyses  of  Means  and  Variances 

Figure  2  shows  mean  performances  over  days  and  was  a  guide  for  the 
analyses. Examining this figure, it can be seen that, generally, performance
increased with practice on all tasks, which indicated learning. In the
following, Vertical Addition (Nv), Horizontal Addition (Nh), Number Comparison
(NC), and Grammatical Reasoning (GR) will be considered in turn.

Vertical Addition (Nv). Mean Nv performance, with the exceptions of
irregularities at Day 4 and Days 11 - 12, appears linear subsequent to Day
2. ANOVA over all days (1 - 15) reveals a significant effect (F(14,224) =
9.12; p < 10 ) with both linear (F(1,16) = 23.79; p < .0002) and nonlinear
components (F(13,224) = 3.60; p < .0001) clearly significant. After dropping
Days 1-4, ANOVA over the remaining days (5 - 15) was significant (F(10,160)
= 5.29; p < 10 ) with the linear component significant, explaining 59% of
the variance, and the nonlinear component also significant (F(9,160) = 2.40;
p < .02). Significant nonlinear components manifested themselves until after
dropping Days 1-11, where the overall ANOVA was nonsignificant (F(3,48) = 2.31;
p > .088). Hence, level and stable means are indicated only after dropping
Days 1 - 11. The omnibus Fmax(15,16) = 2.48 (p > .1) was nonsignificant;
however, the correlation between log variance and day was r = 0.894 (p < 10 ).
After dropping Days 1-8, this correlation dropped to 0.442 and after dropping
Days 1-9, the correlation was 0.274 (p > .599). The Days 10-15 estimated
standard deviation was 8.00. Altogether, Nv means and variances were jointly
stable after Day 11.

Horizontal Addition (Nh). Figure 2 shows Nh mean performance increasing
over the first three days with apparently level performance thereafter.
ANOVA across all days (1 - 15) yielded a significant effect (F(14,224) = 5.94;
p < 10 ) with significant nonlinear trends (F(13,224) = 2.40; p < .005).
Dropping the first three days, the overall F(11,176) = 1.04 was clearly
nonsignificant (p > 0.41), supporting the view of level performance over Days
4-15. The omnibus Fmax(15,16) = 3.17 was nonsignificant (p > .05), but a
Table 2. Computer Battery Differentially Stabilized Cross Day Correlations*

Test                            1         2         3         4

1) Vertical Addition         (0.921)    0.790     0.606     0.209
2) Horizontal Addition        0.929    (0.785)    0.643     0.261
3) Number Comparison          0.721     0.829    (0.766)    0.259
4) Grammatical Reasoning      0.232     0.315     0.317    (0.874)

* Correlations above, reliabilities along, and corrected-for-attenuation
estimates below diagonal.


trend  of  increasing  variances  with  days  was  apparent.  T^e  correlation  of 
log  transformed  variances  and  days  was  jr  *  0.867  (£^10  ).  This  trend  for 

correlated  log  variances  and  days  continued  over  Days  13  -  15  with  r  *  0.999, 
(£<.03).  Altogether,  the  Nh  means  are  stable  after  three  days,  but  variances 
appear  to  be  increasing  across  all  days. 
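
The corrected-for-attenuation entries below the diagonal of Table 2 appear
consistent with the standard correction, r_xy / sqrt(r_xx * r_yy). The source
does not state the formula, so the sketch below assumes it; the function name
is hypothetical, and the numbers are the observed correlations and
reliabilities taken from Table 2.

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Correct an observed correlation for unreliability in both tests."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Stabilized reliabilities (Table 2 diagonal).
reliability = {"Nv": 0.921, "Nh": 0.785, "NC": 0.766, "GR": 0.874}

# Observed cross-task correlations (Table 2, above the diagonal).
observed = {
    ("Nv", "Nh"): 0.790, ("Nv", "NC"): 0.606, ("Nv", "GR"): 0.209,
    ("Nh", "NC"): 0.643, ("Nh", "GR"): 0.261, ("NC", "GR"): 0.259,
}

for (a, b), r in observed.items():
    corrected = disattenuate(r, reliability[a], reliability[b])
    print(f"{a}-{b}: observed {r:.3f}, corrected {corrected:.3f}")
```

Under this assumption the computed values agree with the tabled
below-diagonal entries to within about 0.001.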

Number Comparison (NC). Figure 2 shows NC performance increasing
nonlinearly over the first 6 days and maintaining a level thereafter. ANOVA
over all days (1 - 15) yielded a significant F(14,224) = 3.88
(p < 10^-[exponent illegible]) with significant F(13,224) = 2.00 (p < .022)
nonlinear trends. However, after dropping the first 6 days, the overall
F(8,128) = 0.51 was clearly nonsignificant (p > .84) and confirmed the
graphical impression. Although the variances were nonsignificantly
heterogeneous across days by the omnibus Fmax(15,16) = 4.21 (p > .05), the
correlation of log transformed variance and day number was r = 0.777
(p < .0007). This trend of the variances appeared present until after
dropping Days 1-9, where r = 0.136 was nonsignificant (p > .79) with an
estimated standard deviation of 16.65. Altogether NC means and variances
were jointly stable after Day 9.
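
The repeated "drop the early days and re-test" procedure rests on a
one-way repeated-measures ANOVA of the day effect. A minimal sketch follows,
assuming a subjects-by-days score array; the function name and data are
hypothetical. The reported degrees of freedom (14 and 224 over 15 days) are
consistent with 17 subjects, which is an inference from the F ratios above
rather than a figure stated in this section.

```python
import numpy as np

def day_effect_f(scores, first_day=1):
    """F ratio for the day effect, using Days first_day..last only.

    Returns (F, df_days, df_error) for a subjects x days array.
    """
    x = np.asarray(scores, dtype=float)[:, first_day - 1:]
    n_subj, n_days = x.shape
    grand = x.mean()
    ss_days = n_subj * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_subj = n_days * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_error = ss_total - ss_days - ss_subj       # day x subject residual
    df_days = n_days - 1
    df_error = df_days * (n_subj - 1)
    f_ratio = (ss_days / df_days) / (ss_error / df_error)
    return f_ratio, df_days, df_error

# Hypothetical scores: 17 subjects by 15 days.
rng = np.random.default_rng(1)
example = rng.normal(loc=60.0, scale=10.0, size=(17, 15))
print(day_effect_f(example))               # all days: df = (14, 224)
print(day_effect_f(example, first_day=7))  # Days 7-15: df = (8, 128)
```

The linear and nonlinear trend components reported in the text would require
an additional orthogonal-polynomial partition of the day sum of squares,
which this sketch omits.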

Grammatical Reasoning (GR). Mean performance for GR appears to increase
with negative acceleration across all days (1 - 15). Confirming this view,
ANOVA across all days revealed significant linear (F(1,16) = 20.49; p < .0005)
and quadratic (F(1,16) = 4.88; p < .045) trends. Over Days 2 - 15, the overall
ANOVA was significant (F(13,208) = 3.80; p < 10^-[exponent illegible]). The
linear component accounted for 68.2% of the sum of squares (F(1,16) = 15.47;
p < .001), with nonsignificant nonlinear trends (F(12,208) = 1.31; p > .21).
The omnibus Fmax(15,16) = 4.05 (p > .05) was nonsignificant; however, the
correlation between log variance and day was r = 0.541 (p < .037). After
dropping the first two days, the correlation was r = 0.185, which was
nonsignificant (p > .54). Hence, over Days 3-15 both GR means and variances
were stable.

Moran  and  Computer  Battery  Differential  Relationship 


Table 3 gives estimates of cross-task correlations over differentially
stable days identified in the analyses of the individual batteries. Based
upon 12 subjects common to both the Computer and Moran Battery studies, the
pattern and range of values (r = -0.50 to 0.84) suggested several common
factors. The factor structure was explored by factor analysis.

An iterated principal-factor analysis (PFA) was performed on the
correlation matrix of the two batteries using BMDP4M (Dixon & Brown, 1977).
Subsequent to identification of three factors with eigenvalues greater than
unity by Principal Components Analysis, PFAs were run sequentially using the
communality estimates resulting from each analysis as input to the next
analysis. The sequence of PFAs was continued until the maximum communality
change was less than 0.005. This procedure, it is noteworthy, yields results
equivalent to those obtained by MINRES Analysis (Harman, 1976). Table 4, in
addition to communality, gives the factor loadings subsequent to Varimax
Rotation.
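
The iteration described above can be sketched as follows. This is a minimal
NumPy illustration, not the BMDP4M implementation: the function name is
hypothetical, the starting communalities (squared multiple correlations) are
an assumption, and the Varimax rotation step is omitted, so the loadings
returned are unrotated.

```python
import numpy as np

def iterated_pfa(R, n_factors, tol=0.005, max_iter=500):
    """Iterated principal-factor analysis of a correlation matrix R.

    Each pass places the current communality estimates on the diagonal,
    extracts n_factors principal factors, and recomputes communalities,
    stopping when the largest communality change falls below tol.
    """
    R = np.asarray(R, dtype=float)
    # Assumed starting values: squared multiple correlations.
    h = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h)
        vals, vecs = np.linalg.eigh(reduced)          # ascending eigenvalues
        top = np.argsort(vals)[::-1][:n_factors]
        lam = np.clip(vals[top], 0.0, None)           # guard tiny negatives
        loadings = vecs[:, top] * np.sqrt(lam)
        h_new = (loadings ** 2).sum(axis=1)
        change = np.max(np.abs(h_new - h))
        h = h_new
        if change < tol:
            break
    return loadings, h

# Hypothetical one-factor correlation matrix built from known loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5, 0.7, 0.6])
R_demo = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R_demo, 1.0)
loadings, communalities = iterated_pfa(R_demo, n_factors=1)
print(np.round(communalities, 3))
```

The default tolerance of 0.005 mirrors the stopping rule quoted in the text;
a smaller tolerance gives communalities closer to the fixed point.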

Table 4 shows that Factor 1 (F1) explains more than twice the variance
of Factors 2 and 3 (F2 and F3). Explaining 34.6% of the possible variation,
F1 is dominated by loadings of 0.889 for Nv, 0.858 for NF, 0.819 for Nh, and
0.712 for NC. The heavy loadings for the three arithmetic tasks indicate
that F1 is related to the Ekstrom et al. (1976) Number Facility (N) factor
and support naming it "Number Facility". Factor 2 (F2) explains 14.8% of the
possible variation. A bipolar factor, F2 is identified by loadings of 0.878
for V and -0.680 for GR. The opposition of spatial and verbal tasks suggests
that this may represent a cognitive style factor which might be named
"Reasoning vs Visualization". The last factor (F3) accounted for 14.4% of
the possible variation and is identified by two positive loadings with 0.753
for PS and 0.518 for SC. The failure of NC to appear with PS on this factor
contraindicates the association of this variable with the Ekstrom et al.
(1976) Perceptual Speed (P) factor, which was identified by both PS and NC.
F3 will be named "Perceptual Speed Task". Altogether, the three factors
explain 63.8% of the possible obtainable variance.

Table 4. Rotated Factor Loadings and Communalities

Variable      Factor 1    Factor 2    Factor 3    Communalities

V     1         .148        .878       -.289         .8764
FC    2         .016        .120       -.473         .2381
SC    3         .471       -.035        .518         .4911
NF    4         .858       -.022        .076         .7428
PS    5         .303        .240        .753         .7161
NC    6         .712        .062        .028         .5109
GR    7         .269       -.680       -.222         .5843
VA    8         .889       -.076        .296         .8839
HA    9         .819       -.128        .119         .7015

VP             3.112       1.333       1.300
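
The percentages of possible variation quoted for the three factors follow
from the bottom row of Table 4 divided by the number of variables (nine,
since each standardized variable contributes unit variance). A short check
in Python, with variable names chosen only for this illustration:

```python
# Variance explained by each rotated factor (bottom row of Table 4).
variance_explained = {"F1": 3.112, "F2": 1.333, "F3": 1.300}
n_variables = 9  # nine tasks, each with unit variance in a correlation matrix

percent = {name: 100.0 * v / n_variables
           for name, v in variance_explained.items()}
total_percent = sum(percent.values())

for name, pct in percent.items():
    print(f"{name}: {pct:.1f}% of possible variation")
print(f"Total: {total_percent:.1f}%")
```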

DISCUSSION 

This investigation was directed at the evaluation of tasks for repeated
measures application to environmental investigations. Drawn from the Moran
(Moran & Mefferd, 1959) and Computer (Carter & Sbisa, 1981) Batteries, nine
tasks were examined with respect to the points in practice at which they
attained unchanging or linearly changing means, homogeneous variances,
and constant (differentially stable) intertrial correlations. The
relationships between tasks, subsequent to differential stabilization, were
also explored by factor analysis. The factor analysis and other results
provide a basis for task evaluations. Task evaluations, comparison with
previous studies, a consideration for future studies, and conclusions will
be offered in the following sections.

Task  Evaluations 


Table 5 abstracts task characteristics revealed by earlier analyses.
Examining this table, it can be noted that tasks are organized into four
groups along lines suggested by the factor analysis. Number Facility (NF),
Vertical Addition (Nv), Horizontal Addition (Nh), and Number Comparison
(NC) constitute the first group, which was identified with the first factor
(F1). The second group is composed of Visualization (V) and Grammatical
Reasoning (GR), which were identified with the second factor (F2). The
third group is made up of the Perceptual Speed (PS) and Speed of Closure (SC)
measures, which were identified with the third factor (F3). A fourth and
last group is made up of the single Flexibility of Closure (FC) task. The
FC task had low communality with other tasks (.24), substantial stable
reliability (0.88), and therefore substantial reliable "specificity". This
specificity suggests defining FC as a separate factor with its loading
equivalent to its reliability (0.88). For each group, the task loadings,
stable periods for statistical measures, and stabilized reliabilities are
also given. The factor groups provide collections of tasks which may be
evaluated together using their abstracted characteristics. Evaluations,
given below, will follow group organization.

Factor 1 Group. Nv has both the greatest reliability and the greatest
factor loading of the members of this group. It evidences differential
stability over Days 3-15 and is surpassed only by Nh, which
had unstable variances. Both NF and NC appeared to attain
stability of means and variances slightly earlier in training, but
both involve 180 second trials versus 120 seconds for Nv.
Altogether Nv appears the choice from this group of tasks.

Factor 2 Group. GR is the only member of this bipolar
factor group which exhibits stability of means and variances.
This is unfortunate as the opposition of spatial (V) and verbal
(GR) tasks suggests a "cognitive style" factor as noted earlier.
GR can be recommended as the only stable member of this group.

Factor 3 Group. PS has the largest loading and reliability and the
earliest stabilization of means and variances. SC attains
differential stability earlier, but attains overall stability later
than PS. Hence, PS can be recommended as the representative of
this group.

Factor 4 Group. FC is the only member of this group and
is recommended by its overall stability.

Overall, the Vertical Addition (Nv), Perceptual Speed (PS), Grammatical
Reasoning (GR), and Flexibility of Closure (FC) tasks may be recommended as
the result of the evaluation.

Comparison  with  Previous  Studies 

The results from the current study may be compared with earlier
investigations employing the same paradigm (Seales, et al., 1980; Carter,
et al., 1981). Seales, et al. reported on a 10 minute test, a sequence of four
arithmetic operation subtasks (addition, subtraction, multiplication, and
division), which appeared to meet all stability criteria from the first day
with a reliability of 0.941. In the present investigation, the NF, Nv, and
Nh arithmetic tasks involved only the addition operation and respectively
were 3, 2, and 2 minutes in duration with reliabilities of 0.83, 0.92, and
0.79. It might be expected that the reliabilities of the NF, Nv, and Nh tasks
would have been of the order of 0.83 for a three minute task and 0.76 for a
two minute task based on the Seales, et al. results and the Spearman-Brown
Formula (Winer, 1971). Only the Nv results are out of line with these
estimates (p < .05), with a greater than expected reliability of 0.92. The
required periods for overall stabilization in the present investigation
initially appear somewhat excessive, but shortening the task length by
factors of 3 to 5 may provide better assessment of the transitions to
stability than the 10 minute block of the Seales, et al. (1980) task. The
present investigation, in addition, employed more sophisticated statistical
methodologies than Seales, et al. (1980). In any case, the present
investigation indicates that the NF and Nv tasks require only 24 and 22
minutes of practice before they would be suitable for repeated measures
applications.

The results of Carter, et al. (1981) are comparable with the current
results for the Grammatical Reasoning (GR) Task. In their study, Carter, et
al. reported that GR met all stability criteria after four daily (60 second)
administrations with a stable reliability of r = 0.82. This investigation
found that all stability criteria were met after six daily (90 second)
administrations with a stable reliability of r = 0.87. Perhaps due to the
sharpened statistical methodologies, the period to meet all criteria was
again somewhat lengthened, although only 9 minutes total appears necessary
even in the present study. The reliability of r = 0.87 found in the present
study is exactly what would be estimated from the Carter, et al. results and
application of the Spearman-Brown Formula (Winer, 1971).

Altogether, the present results are comparable with those found in earlier
studies with the same paradigm. More sensitive statistical methods are
suspected of somewhat extending the required period for overall stability;
however, required practice durations were not practically changed. The
comparability of Spearman-Brown adjusted reliabilities across investigations
supports the view that the tasks are differentially stabilized, as such
stability is an assumption of the method. The stability of the adjusted
stabilized reliabilities recommends, at least for arithmetic and grammatical
reasoning tasks, estimation of reliabilities either by the Spearman-Brown
Formula or by graphical methods (Bittner & Carter, 1981).
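
The Spearman-Brown estimates quoted in this section can be reproduced with
a few lines of Python. The sketch below uses the general form of the
formula, r_k = k*r / (1 + (k - 1)*r), where k is the ratio of new to
original test length; the function name is chosen for this illustration, and
the inputs are the reliabilities and task durations given in the text.

```python
def spearman_brown(r, k):
    """Predicted reliability when test length is multiplied by k."""
    return k * r / (1.0 + (k - 1.0) * r)

# Seales, et al.: 10 minute arithmetic test, reliability 0.941.
three_min = spearman_brown(0.941, 3 / 10)   # NF (3 minutes)
two_min = spearman_brown(0.941, 2 / 10)     # Nv and Nh (2 minutes)

# Carter, et al.: 60 second GR test, reliability 0.82, lengthened to 90 s.
gr_90s = spearman_brown(0.82, 90 / 60)

print(f"3 minute estimate: {three_min:.2f}")
print(f"2 minute estimate: {two_min:.2f}")
print(f"90 second GR estimate: {gr_90s:.2f}")
```

These agree with the 0.83, 0.76, and 0.87 values cited above when rounded to
two decimals.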

A Consideration for Future Studies

The  present  investigation  evaluated  tasks  in  the  units  of  measurement 
employed  in  the  original  research  of  Moran  and  Mefferd,  (1959)  and  Carter  and 
Sbisa  (1981).  In  terms  of  numbers  accomplished  in  a  test  period,  the  task 
scores  are  typical  of  a  breadth  of  measures  used  to  assess  cognitive  abilities 
and skills (cf. Ekstrom, et al., 1976). Transformations of scores were not
examined  despite  repeated  evidence  suggesting  their  use.  Of  the  nine  measures 
examined, six (67%) were found with correlations between day number and
log-transformed variances (viz., NF, V, Nv, Nh, NC, GR) and two (V and Nh) were
apparently  unstable  over  this  study's  duration.  Prediction  of  increasing 
variances  with  trials,  it  is  noteworthy,  has  been  made  by  Jones  (1972)  for  the 
class  of  tasks  exemplified  in  this  study.  This  prediction  may  be  made  from 
assumptions  that  (1)  learning  increases  the  rate  of  task  processing  and  (2) 
individuals  tend  to  retain  their  relative  differences.  The  increase  in  vari¬ 
ances  over  trials,  usually  paralleling  the  means,  suggests  scaling  trans¬ 
formations  with  negative  power  strengths  such  as  logarithmic,  square  root, 
etc.  (Tukey,  1957).  Other  recently  developed  statistical  methodology, 
involving conjoint measurement (Cliff, 1973, pp. 475-476) and
multidimensional scaling (Carroll & Arabie, 1980, pp. 629-630), might also
prove of value for
linearization.  In  any  case,  the  selection  of  method  of  transformation  or 
scaling  is  an  empirical  one  which  would  require  examination  of  the  results 
in  terms  of  the  statistical  requirements  for  repeated  measures  applications. 
Consideration of the use of transformation and scaling methods to improve the
behavior  of  task  scores  appears  desirable  in  future  evaluations. 
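
The argument for a transformation can be illustrated with a small
numerical sketch. It assumes a purely multiplicative model, in which each
subject's score is a fixed relative ability times the group's mean rate on
that day; this is one simple reading of assumptions (1) and (2) above, not a
model stated by the authors, and all numbers are hypothetical.

```python
import numpy as np

# Hypothetical relative abilities (retained across days) and daily mean rates
# (increasing with practice).
ability = np.array([0.8, 0.9, 1.0, 1.1, 1.2])
mean_rate = np.array([20.0, 30.0, 40.0, 45.0])

scores = np.outer(ability, mean_rate)            # subjects x days

raw_var = scores.var(axis=0, ddof=1)             # grows with the daily mean
log_var = np.log(scores).var(axis=0, ddof=1)     # constant under this model

print("raw variances:", np.round(raw_var, 2))
print("log variances:", np.round(log_var, 4))
```

Under this model the raw variances rise with the square of the mean while
the variances of log scores stay fixed, which is why a logarithmic
transformation is a natural first candidate. Real scores include error
variance, so the choice remains an empirical one, as noted above.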

Conclusions 


Two basic conclusions may be made based upon the results of this
investigation: First, Vertical Addition (Nv), Perceptual Speed (PS),
Grammatical Reasoning (GR), and Flexibility of Closure (FC) tasks may be
recommended for
repeated  measures  application  subsequent  to  sufficient  practice  for  stability. 
Second,  the  use  of  transformations  and  scaling  techniques  should  be  considered 
in  future  investigations  of  task  stabilization. 

